Hardly a day – hardly a post – goes by here at the Swing State Project without a reference to the Cook Political Report’s Partisan Vote Index, or PVI for short. In the wake of the 2008 elections, SSP’s pres-by-CD project has spurred a lot of discussion about how the PVI is calculated and why it’s calculated the way it is.
Quite a few people had a hard time believing my explanation of the math behind the PVI. But you don’t have to take my word for it – this is how the Almanac of American Politics explains things:
Cook Partisan Voting Index. Refers to the Partisan Voting Index (PVI) as used by Charlie Cook, Washington’s foremost political handicapper. The PVI is designed to provide a quick overall assessment of generic partisan strength. For this volume, the PVI includes an average of the 2000 and 2004 presidential elections in the district as the partisan indicator. The PVI value is calculated by a comparison of the district average for the party nominee, compared to the 2004 national value for the party nominee. The calculations are based upon the two-party vote. The national values for 2004 are George W. Bush 51.2% and John Kerry 48.8%. The PVI value indicates a district with a partisan base above the national value for that party’s 2004 presidential nominee. Thus a district with an R+15 is a district that voted 15 percentage points (as an average of its 2000 and 2004 presidential vote) higher for Bush than the national value of 51.2%. Similarly, a district with a D+15 is a district that voted 15 percentage points (as an average of its 2000 and 2004 presidential vote) higher for Kerry than the national value of 48.8%. An X +00 indicates an evenly balanced district. (Emphasis added.)
The boldface sentences confirm my understanding of how PVI works. But why should it be calculated this way? I agree with the majority sentiment that it seems to make more sense to compare 2000 district performance with 2000 nationwide performance, not 2004 nationwide performance. This isn’t as big of a deal with the two Bush elections because they were both so close, but comparing Kerry’s 2004 district numbers with Obama’s nationwide numbers produces some pretty serious gaps. I’d be curious to know what sort of justifications or rationales anyone can come up with for the status quo.
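For anyone who wants to see the arithmetic spelled out, here is a minimal Python sketch of the calculation as the Almanac describes it. The function name `cook_pvi`, the "EVEN" label, and the sample district are my own assumptions for illustration; only the 51.2% and 48.8% national figures come from the quoted passage.

```python
# National two-party shares for 2004, as quoted from the Almanac above.
NATIONAL_2004 = {"R": 51.2, "D": 48.8}


def cook_pvi(rep_share_2000, rep_share_2004, national=NATIONAL_2004):
    """Return a label such as 'R+15' from a district's two-party GOP shares.

    Both district shares are percentages of the two-party vote. Their
    average is compared with the 2004 national value only, which is the
    step under debate in this post.
    """
    district_avg = (rep_share_2000 + rep_share_2004) / 2
    lean = round(district_avg - national["R"])  # positive means more Republican
    if lean == 0:
        return "EVEN"
    party = "R" if lean > 0 else "D"
    return f"{party}+{abs(lean)}"


# Hypothetical district: Bush took 65.2% and 67.2% of the two-party vote.
print(cook_pvi(65.2, 67.2))
```

Note that the 2000 district result is never compared with the 2000 national result here, which is exactly the choice being questioned.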
In the meantime, some have suggested computing an “SVI” – a “Swing State Project Voting Index,” comparing 2004 to 2004 and 2008 to 2008. In fact, CalifornianInTexas has already gone ahead and started calculating these numbers. For the most part, these will be more favorable to Dems, as the big Kerry minus Obama splits are removed from the equation.
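A rough sketch of how such an "SVI" might be computed, assuming it simply averages each cycle's gap between the district and that same year's national result. The function name, the sample district, and the 2008 national figure (an approximation of Obama's two-party share, not taken from this post) are all assumptions.

```python
def same_year_index(district_dem, national_dem):
    """Average each cycle's district-minus-national gap in Democratic share.

    Both arguments map a year to the Democratic percentage of the two-party
    vote. Positive results lean Democratic, negative lean Republican.
    """
    gaps = [district_dem[year] - national_dem[year] for year in district_dem]
    return sum(gaps) / len(gaps)


# 2004 is the Almanac's two-party figure; the 2008 value is an approximation
# supplied for illustration and should be checked before real use.
NATIONAL_DEM = {2004: 48.8, 2008: 53.7}

# Hypothetical district: Kerry 50.8%, Obama 55.7% of the two-party vote.
district = {2004: 50.8, 2008: 55.7}

svi = same_year_index(district, NATIONAL_DEM)
# PVI-style figure: two-cycle district average against the latest national value.
pvi_style = sum(district.values()) / len(district) - NATIONAL_DEM[2008]

print(f"same-year index: {svi:+.2f}")
print(f"PVI-style index: {pvi_style:+.2f}")
```

In this made-up district the same-year index comes out about two points more Democratic than the PVI-style figure, because the Kerry-versus-Obama gap no longer enters the comparison.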
So, I’m asking the community: Should we use the “SVI”? Should it be in addition to the PVI? Are there any pitfalls if we do so? Any reasons not to? Let’s hear your thoughts!
My opinion is that:
(1) The SVI looks to be more useful than the PVI in determining a district’s lean, but
(2) The rest of the political world will be using PVIs, which could make it harder for SSP analysis to translate to outside readers…
(3) so primarily stick with the PVI (in charts and all), but use the SVI if it provides interesting information on write-ups and analyses.
Does Cook have an explanation for why he uses this methodology? Maybe there is a good reason we are not seeing?
Campaign Diaries
I’m willing to convert to using the SVI exclusively. Leftblogistan needs to be a thought leader, not just a bunch of people typing away from their parents’ basements.
But my training as an engineer says that we need to look at 2008 PVI side-by-side with 2008 SVI, understand which districts have more than a 5-point (arbitrary number) difference, and make a gut call on which data fits reality better.
I know this community is fully capable of completing such a thorough analysis. Let the fun begin!
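The side-by-side check suggested above could start from something like this Python sketch. The district names and leans are made up, the signed-number convention (positive for Democratic) is an assumption, and the 5-point threshold is the same arbitrary number mentioned in the comment.

```python
def flag_divergent(pvi, svi, threshold=5.0):
    """Return (district, pvi, svi) rows where the indexes differ by more than threshold.

    Leans are signed numbers: positive for Democratic, negative for Republican.
    """
    rows = []
    for district in sorted(pvi.keys() & svi.keys()):
        if abs(pvi[district] - svi[district]) > threshold:
            rows.append((district, pvi[district], svi[district]))
    return rows


# Hypothetical leans for made-up district names, not real data.
pvi = {"XX-01": -6.0, "XX-02": 3.0, "XX-03": -12.0}
svi = {"XX-01": 0.5, "XX-02": 4.0, "XX-03": -11.0}

for district, p, s in flag_divergent(pvi, svi):
    print(f"{district}: PVI {p:+.1f}, SVI {s:+.1f}")
```

The flagged districts are the ones where a gut call on which number fits reality would actually be needed.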
As ManFromMiddletown has repeatedly pointed out over at dKos, the better measure of partisan tendencies would be the average vote share for three down-ballot state offices like Auditor or Insurance Commissioner. Because nobody knows nor much cares who these candidates are, they get votes because of their party line, not as individuals.
Basing the PVI on the Presidential vote is quick and dirty, and standardized — because not every state elects an Auditor or whatever.
But at the Presidential level, the candidate’s personalities and characteristics — like race in ’08 — can swing a lot of votes. And in the South particularly, the partisan trend may be Democratic at the state and local level and Repub at the national level. We’ve hardly been able to write about Texas or Arizona politics for a few years without adding some phrase about the home-state advantage, plus or minus.
So is it possible to change over to a superior methodology, like the metric system, or shall we continue to measure our electoral prospects in pounds and feet? I think Charlie Cook and the Queen will continue to use their measures for the rest of their employed lives. I just don’t know about the rest of us.
The best way to compare Cook’s PVI with Swing State’s SVI is to run them side by side for an election cycle. Then everyone will see which system proves the more accurate.
Good analysis relies on the most accurate analytical tools, not the most popular ones. So in that I do disagree with those who think Swing State analysis should be based on the industry standard rather than the most accurate in the industry.
My 2 cents.
Using both allows a comparison of short and medium term trends for a district at a glance.
Using only PVI ignores two cycles of Democratic uptick.
Using only SVI presumes a one-cycle result as the normal character of a district.
If the site only uses SVI, then we are basing our read of the partisan breakdown on a single cycle in which the GOP ticket was headed by a guy their base did not much care for, who did not have anywhere close to an even playing field in financial resources, as the economy crumbled in spectacular fashion under a sitting president of his party.
If the site uses only PVI, we overstate GOP strength based on the Bush 2004 GOTV operation which no longer exists and ignore the shift in party ID over the last few years.
In any event, for the 2010 cycle the basic flaw with SVI and PVI is each concentrates exclusively on presidential results. Off-year cycles are very different beasts when it comes to TO, especially in states which have their gubernatorial elections in Presidential years (and the usual problem of getting out Leap Year Dems generally).
Besides, presidential results often vary wildly from down-ticket races. More useful is how a party’s candidate performs compared to party registration (where applicable).
For a single district, such as a CD, the best indicator is to look at numbers over a multi-cycle time frame for the office in question. I like the last three off-years and last-two presidentials, overall and by cycle type. But we need something more shorthand, don’t we?
Either way, PVI or SVI, you get a nice take on whether a party’s nominee is overachieving or under-performing in comparison to the top of his/her ticket.
There is no magic bullet, no algorithm which will tell us by plugging in data which district can be swung with a little effort. Like with investments, past performance is no guarantee of future performance.
Of course, my preference is an Excel sheet listing vote totals by office (prez, gov and CD) side-by-side over the last several cycles… broken down by county. Get more out of that than a PVI. But it does take up too much space, huh?
Anyway, I like the idea of using both. Putting them side by side gives us a quick take on short and medium-term trends.
Look at it this way, if the PVI and SVI are the same in a district, that tells you the GOP isn’t losing a bit of their support come Hell or high water. Where the SVI number is more D than the PVI we see fertile ground.
Cook’s PVI is merely a relative value even using the old data. A D+0 seat is not a 50/50 district but is top heavy for Democrats (6-1 if my data is correct). The new Cook numbers would mean that the divide between likely Republican and likely Democratic at the House level would come at somewhere around R+5 or R+6. The data is useful but kind of screwy.
We saw some of this during the last election when people were salivating about R+1 or R+2 districts. The recognition was that they leaned Democratic (which they do).
I live in a house that’s over 40 years old that was constructed on somewhat marshy soil. The floors all tilt in a mostly uniform fashion. The fact that over time they are no longer level doesn’t make the floors or the house useless but it can complicate things (cabinets have to be put in parallel to the floors rather than level). The same thing with Cook’s PVI.
I think we should use a dual system of level (SVI) and parallel (Cook PVI). One more election like the last two (hope, hope) and dealing with the Cook PVI will be like dealing with the grade in the Grand Canyon.
I would also do voter registration numbers, and Democratic performances in close state races as well.
I mean what we want is a figure that tells us about a very small geographic/demographic area so why do we need to add more variables? Particularly when many states aren’t contested by either candidate – surely tv ads concentrated in swing states like Ohio and Florida are going to add artifacts to the PVI/SVI that aren’t there in, say, Idaho or Massachusetts. But maybe I have completely the wrong end of the stick. I’m prepared to be enlightened! 🙂
I don’t think Cook’s PVI or this proposed SVI would be as useful a measurement as they could be. A district with a 2004 Cook PVI of D+1 or D+2 is actually a Republican district, since Bush won by just over 2% nationally. You’re going to see a lot more of those with Obama’s numbers because his margin of victory was much larger.
The proposed “SVI,” comparing numbers within one election, doesn’t solve this problem either. If Obama’s national win was just over 7%, every district with an SVI of up to R+7 is actually a Democratic district. But people are going to consider an R+7 district to be solidly Republican, even though it barely went to Obama. I realize the point is to compare the district to the national average, but I’m not sure how useful that information is when predicting who’s going to win in a district. Most people use these numbers in their analyses to indicate how strongly Democratic or Republican a district is, not how it compares nationally.
The only way I can see to avoid this is to have the letter (“R” or “D”) always indicate which party won the district and the number indicate by how much. So a district with a PVI of D+2 went for Obama by 2% (51%-49%). A district with R+6 went for McCain by 6%. With the SVI (or Cook’s PVI not including 2004), these districts would be R+5 and R+13, respectively, and we would be a center-right nation.
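To make the two labelings concrete, here is a small Python sketch that reproduces the numbers in the paragraph above. It works on margins (not vote shares), uses the rounded 7-point national margin mentioned earlier, and the function names are hypothetical.

```python
def margin_label(dem_margin):
    """Format a signed Democratic margin (in points) as 'D+n' or 'R+n'."""
    party = "D" if dem_margin >= 0 else "R"
    return f"{party}+{abs(dem_margin):g}"


def relative_label(district_dem_margin, national_dem_margin):
    """Label the district's margin relative to the national margin."""
    return margin_label(district_dem_margin - national_dem_margin)


NATIONAL_MARGIN_2008 = 7  # rounded figure for Obama's national win, per the comment

for district_margin in (2, -6):  # Obama by 2, McCain by 6
    print(margin_label(district_margin),
          relative_label(district_margin, NATIONAL_MARGIN_2008))
```

The first label on each line says who won the district and by how much; the second says how the district compares with the country, which is where the R+5 and R+13 come from.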
Another option would be to average the House, Senate, and Presidential numbers, so we don’t rely exclusively on the Presidential race to define a district’s lean.
Include the PVI in brackets afterward, but SVI just seems to make more sense.
Hell, imagine how little the system would have worked after huge landslides like Reagan’s in 1984 and Nixon’s in 1972.
That said, if you have any contact details for Charlie Cook or one of his confederates, I’d be interested to hear a defence of his method.
i think whatever the methodology, it should include ’08 numbers and drop out ’00 numbers, otherwise it’s increasingly out of date.
i guess i can see the appeal for cook and maybe others in that the ’00 and ’04 national elections were close to 50-50, therefore you almost removed a complex variable and you can just say if they voted 60% for GOP in ’00 and ’04 then they are GOP +20. if you have a national election of 53-46 then the calculation becomes harder, but that’s no excuse for sloppiness.
but again what does nate think about this?
It’s definitely a good idea to take the lead in developing a more rational version of PVI. Look at 538- Nate didn’t get so much play by tagging along with the half-assed analyses already in use. However, if you are going to take the lead, it makes sense to try to do this carefully. Presidential votes will be way off sometimes because of home-state advantage or other special circumstances. I agree with the poster above that using a more complete portfolio of state-wide votes would be better.
It makes a lot more sense. Although if we keep using the PVI alongside it, we are only giving the PVI undeserved credit. If we all decide it is stupid and is worth making a new system over, then why present the other?
It’s tempting to almost use 2000 no matter what as one baseline considering the calculations since the election was as close as it was. I think we’ll see funny things with the blow-out of 2008, but 2008 was aberrationally good for us. 1984 and 1988 were the last times the electorate was so tilted and that, of course, was for the GOP. Before that, we go all the way back to 1964 for us and 1972 for them. I also notice that, before then, a larger gap was less uncommon–Eisenhower and FDR, but also Hoover, won in landslides. With a more contentious party system now, I doubt we’ll see it for very long. Obama MAY be able to improve for 2012 if he does really well, but this may be a once in a half-generation sorta thing. PVI tells us which districts flow with the tides and which are stubbornly resistant. SVI would also be a good microtargeting tool, but something tells me that it would just be an under/overperformance scale and wouldn’t be as cross-comparable.
KISS. I guess I don’t see the value of seeing how much “more republican” or “more democratic” a district is than a national average. the closest we can figure how democratic a place is IN AN ABSOLUTE SENSE is of more value.
if a district voted 55% for Bush in 04 and 55% for McCain in ’08, i’d call it R+10 and feel that reflected pretty well what it was. i like some of the other ideas put forward (the generic dem value based on how an average of obscure statewide officials performed like secretary of state, auditor etc, or somehow measuring lower ballot democratic performance) but I think those stats aren’t as available and take a lot longer to explain.
i think the biggest arguments against the old (’00 and ’04) PVIs is the demographic changes in these districts. i read recently that if the electorate was the same as it was in 1992, McCain would have won easily, but because of larger numbers of young people and people of color it was a whole different story. these trends are going to continue and old numbers will miss the story.
to the discussion. On the one hand, I like having PVI (or something like it) because it lets you sum up a district in one number. (And I’d be inclined to just keep using PVI rather than our own concoction on the front page, just to maintain compatibility with the rest of the pundit-sphere… as a hyperbolic comparison, I’d hate to switch SSP to Esperanto because we’ve all decided, correctly, that it’s a more logical language than English.)
But on the other hand, I’m getting kind of tired of PVI, as it’s just one dimension out of many in describing a district, and one we shouldn’t fetishize. So, for me at least, the discussion of whether a district should be an R+13 or an R+15, depending on what baseline we use, is more of a distraction than anything.
Here’s a case in point. Think of all the different districts that clock in at (old PVI of) R+3. (I guess I randomly grabbed that number because I’ve been thinking about NY-20.) These districts have little in common. Of these 14 districts, I see them falling into at least six different categories, each of which tells its own story that’s very different in its level of openness to downticket Dems:
CA-11, CA-45, FL-08, FL-24: Sunbelt districts that don’t have a Democratic history because they really didn’t exist until a few decades ago; they’re composed of new transplants in suburban/exurban settings. Trending Democratic, and also increasingly willing to vote Dem downticket, although often in reaction to terrible GOP incumbents.
IL-06, MN-02: Midwestern middle-class suburban districts that are increasingly willing to vote Dem at the top of the ticket but are still unwilling to part with conservative GOPers downticket.
NY-20, NY-26: Northeastern suburban/rural districts where there’s still a Rockefeller Republican tradition, esp. downballot, but a favorable overall trend toward Dems.
NC-02, NC-07, NC-08: Lowland southern rural districts where there’s enough of a tradition of Yellow Dog Democratic voting plus a sizable African-American minority that Democrats can thrive downballot.
OH-03, PA-04: Rust Belt districts that mix urban and rural components; ancestrally Democratic but trending away from us as unionists die off, but still amenable to pro-labor socially conservative Dems.
TN-04: Appalachian rural district with a history of voting Democratic downballot (and upticket too until lately), but trending away from us fast at all levels.
So, I was thinking if we really want to go large, and contribute something to the broader blogosphere, that goes beyond a purportedly more accurate version of PVI, that really affects the larger conversation about what a district can and can’t support, maybe we should try categorizing districts in terms of 20 or 25 typologies. Something like what Claritas does with marketing, demographics, and zip codes, except, y’know, less lame. For instance, I’m sure we can think of other districts that fit easily into each of the six categories I have above. Sounds like a lot of guesswork initially, but if we find variables that truly work, we could actually do some regression analysis and make sure that it meets SSP’s usual data-driven standards.
The question: How do we get it?
What we need is a way to estimate a sort of generic D vs. R matchup for each district (or state, or state senate/house district, or county, or town, or even precinct). Then we can figure out how well our candidate did, versus their candidate.
Ideally, there should be a way of finding this out. For example, we could ask everyone in the district who they’d vote for, generic Democrat or generic Republican, for a given position, and then ask them whether they were certain of their choice. If certain, we’d put them in the “solid D” or “solid R” bin, and if uncertain, we’d still get an idea of how big the swing block is, and whether they lean D or R.
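As a toy illustration of the binning just described, here is a short Python sketch. The response format, the bin names, and the sample answers are assumptions for illustration, not a real survey design.

```python
from collections import Counter


def bin_responses(responses):
    """Count respondents per bin from (party, certain) pairs."""
    bins = Counter()
    for party, certain in responses:
        bins[f"{'solid' if certain else 'lean'} {party}"] += 1
    return bins


# Hypothetical answers: (generic-ballot choice, certain of that choice?)
responses = [("D", True), ("D", False), ("R", True),
             ("R", True), ("R", False), ("D", True)]

bins = bin_responses(responses)
swing = bins["lean D"] + bins["lean R"]  # size of the uncertain swing block
print(dict(bins), "swing block:", swing)
```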
Now, even if we do an abstract version of this, via polling, we (or anyone else) would still need to put in a ton of groundwork into gathering this data.
If we can approximate this data via some other measure, that would be quite useful. For example, we could try to average out the performance of candidates to that seat in the past, but seeing as we are trying to gauge exactly that, this would be a wolf-guarding-the-sheep situation. Another method is to compare to results up and down the ballot. While this can be skewed by differing perceptions of the parties at different levels of government (such as Democrats being popular locally in heavily conservative areas), this is nevertheless the basic idea behind Cook’s Partisan Voting Index, on which our “SSP PVI” is based.
Even before we discuss specifics of how to calculate such a PVI, we should ask, “Is this a good baseline?” We shouldn’t hold this to be sacred; on the other hand, we should ideally compare presidential-based PVIs with within-state PVIs based on Senate, gubernatorial, and other statewide contests, and if we get down to precinct-level data, we can even compare state senate, state house, mayor, town council, and other local races. The presidential, gubernatorial, and senate results, however, are perhaps the easiest to sort out of all these data sets.
However, they are also the highest-profile results, easily influenced by personalities and unique characteristics of individual candidates. Perhaps lower-profile statewide offices may be useful…
And don’t forget that people undervote–not all presidential, gubernatorial, senate, or other statewide office votes will include votes for Representative, State Senator/Representative, etc.
I think the SVI is better. SVI will average to 0, and PVI will not.
But it could be even better. One thing is to include the native son effect for POTUS and VPOTUS. That would make it a more accurate gauge of true feeling, and it would be relatively easy to do.
Given that the data is already entered, we could also look at trends over time.
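The point above about the SVI averaging to 0 can be checked with a quick Python sketch. It assumes the average is weighted by votes cast and that the national figure is just the pooled district vote from the same election; the district vote totals and the 4.9-point baseline shift are made up.

```python
# Hypothetical districts: (Democratic votes, Republican votes) in one election.
districts = [(120_000, 80_000), (90_000, 110_000), (60_000, 140_000)]

total_dem = sum(d for d, _ in districts)
total_votes = sum(d + r for d, r in districts)
national_share = 100 * total_dem / total_votes  # same-year national baseline


def weighted_mean_gap(baseline):
    """Vote-weighted average of (district Democratic share - baseline)."""
    weighted = sum((d + r) * (100 * d / (d + r) - baseline) for d, r in districts)
    return weighted / total_votes


# Same-year baseline: the gaps cancel out, up to floating-point error.
print(round(weighted_mean_gap(national_share), 6))
# Baseline borrowed from a different election: the gaps no longer cancel.
print(round(weighted_mean_gap(national_share + 4.9), 6))
```

A plain unweighted average across districts would only be approximately zero, since districts cast different numbers of votes.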